Christopher R. BEBBINGTON
RECOMBINANT DNA EXPRESSION VECTORS

CARPR 0 037D2 



RESPONSE 



Assistant Commissioner of Patents 
Washington, D.C. 20231 

Sir: 



November 25, 1996



In response to the Official Action dated July 24, 1996,
Applicant respectfully submits that this Response is being timely
filed under the Next-Business-Day Rule (November 24, 1996 fell on
a Sunday). Kindly consider the Response as follows:
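The timeliness argument above is date arithmetic and can be checked with a short sketch. It assumes a three-month response period running from the July 24, 1996 Official Action (the source does not state the period), extended by the one (1) month extension mentioned later in this Response. The helper names are illustrative only, and federal holidays are not modelled.

```python
from datetime import date, timedelta


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, keeping the day of month (assumes that day exists)."""
    total = d.month - 1 + months
    return d.replace(year=d.year + total // 12, month=total % 12 + 1)


def next_business_day(d: date) -> date:
    """Roll a Saturday or Sunday forward to the following Monday (holidays ignored)."""
    while d.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        d += timedelta(days=1)
    return d


action_date = date(1996, 7, 24)
# Assumption: 3-month period plus the 1-month extension requested in this Response.
nominal_due = add_months(action_date, 3 + 1)
effective_due = next_business_day(nominal_due)

print("nominal due date:", nominal_due.isoformat(), "weekday index", nominal_due.weekday())
print("effective due date:", effective_due.isoformat())
```

Under these assumptions the nominal due date falls on a weekend day and rolls forward to the filing date shown on this Response.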



1. At the top of page 2 of the Official Action of July 24, 
1996, claims 6, 7 and 11-18 stand rejected as being indefinite in 
the recitation of "recombinant".

Applicant respectfully submits that the term "recombinant" 
has a well defined, art recognized meaning. Furthermore, claims 
to the vector identical to the instant transformed host cell 
claims have already been allowed. For the Examiner's convenience 
the following table presents the allowed claims from parent 
application 08/300,063: 



Claim Number    Claim

11              A recombinant expression vector comprising the
                promoter, enhancer and complete 5' untranslated
                region including the first intron of the hCMV-MIE
                gene operably linked to a heterologous
                coding sequence.

12              The recombinant expression vector according to
                claim 11 further comprising a restriction site
                to facilitate insertion of the heterologous
                coding sequence.

13              The recombinant expression vector according to
                claim 11 wherein the promoter, enhancer and
                complete 5' untranslated region including the
                first intron of the hCMV-MIE gene are linked
                directly to the heterologous coding sequence.

14              The recombinant expression vector according to
                claim 11 wherein the vector further includes
                the hCMV MIE gene's includes a translational
                initiation signal.

15              The recombinant expression vector according to
                claim 14 wherein the translational initiation
                signal includes the sequence
                5'-GTCACCGTCCTTGACACCATG-3'.

16              The recombinant expression vector according to
                claim 14 wherein the translational initiation
                signal includes the sequence 5'-CCATGG-3'.
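The two translational initiation sequences recited in claims 15 and 16 can be inspected with a minimal sketch. It checks only simple string properties: whether each sequence reads the same on the complementary strand, and whether the longer sequence ends in the ATG start codon. CCATGG is the recognition site of the restriction enzyme NcoI. The variable and function names are illustrative and do not come from the application.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


signal_claim_15 = "GTCACCGTCCTTGACACCATG"
signal_claim_16 = "CCATGG"

# A palindromic site equals its own reverse complement.
claim_16_is_palindromic = reverse_complement(signal_claim_16) == signal_claim_16
claim_15_is_palindromic = reverse_complement(signal_claim_15) == signal_claim_15

# The claim 15 sequence ends with the ATG initiation codon.
claim_15_ends_with_atg = signal_claim_15.endswith("ATG")

# The two sequences share the core "CCATG" (claim 16 minus its final G).
shared_core_present = signal_claim_16[:-1] in signal_claim_15

print(claim_16_is_palindromic, claim_15_is_palindromic,
      claim_15_ends_with_atg, shared_core_present)
```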



2. On page 2 of the Official Action of July 24, 1996, claims
6, 7 and 11-18 stand rejected as being anticipated by Whittle et
al. (Protein Engineering, 1987).

Applicant respectfully submits that the Examiner appears to
have mistakenly read the publication year of Whittle et al. as
1985 (see page 2, line 20 of the Official Action of July 24,
1996). Review of the Whittle et al. publication indicates that
the actual publication year was 1987. Applicant respectfully
submits that the Whittle et al. publication does not constitute
prior art under any section of 35 USC § 102. Although the copy of
the reference provided by the Examiner does not set forth a
publication date, page 505 of the publication, in the last
sentence after the "Reference" section, indicates that the paper
was received for review on October 8, 1987 and revised on
December 8, 1987. As the priority date for the instant
application is July 23, 1987, the Whittle et al. reference does
not constitute prior art under 35 USC § 102 because the article
had not even been submitted for review at the priority date of the
instant application.

Applicant respectfully directs the Examiner to British 
Application No. GB 8717430 to which priority is claimed under 35 
USC § 119. This document was submitted on December 1, 1995 in 
parent application 08/300,063 in which the issue fee was paid on
May 22, 1996. For the Examiner's convenience, a photocopy of 
said application is attached as Exhibit A. Applicant 
respectfully requests official acknowledgement of the priority 
document submitted in parent application 08/300,063 and 
withdrawal of the above rejection. 

3. In view of the foregoing Remarks, Applicant respectfully 
requests the Examiner to reconsider and withdraw her rejections 
of the claims. Should the Examiner feel that an interview would
expedite the prosecution of this application, she is invited to
call the undersigned at her convenience.

The Commissioner is hereby authorized to charge any 
additional fees which may be required for this amendment, or 
credit any overpayment to Deposit Account No. 19-3700. 

An extension of time is being filed with this Response. In
the event that the required fee for a one (1) month extension of
time is not attached, or that fees are required in addition
to those requested in the petition for an extension of time, the
Commissioner is requested to grant a petition for whatever extension
of time is required to make this Response timely, and is
hereby authorized to charge any fee for such an extension of time,
or credit any overpayment for an extension of time, to Deposit
Account No. 19-3700 and to notify the undersigned accordingly.



Respectfully submitted, 




John W. Schneller
(Registration No. 26,031) 
SPENCER & FRANK 
Suite 300 East 
1100 New York Ave., N.W. 
Washington, D.C. 20005-3955 
Telephone: (202) 414-4000 
Telefax: (202) 414-4040

THE PATENT OFFICE
STATE HOUSE
66-71 HIGH HOLBORN
LONDON WC1R 4TP

REC'D 26 OCT 1988
WIPO PCT



I, the undersigned, being an officer duly authorised in accordance with Section 62(3) of the
Patents and Designs Act 1907, to sign and issue certificates on behalf of the Comptroller- 
General, hereby certify that annexed hereto is a true copy of the documents as originally 
filed in connection with the patent application identified therein. 

In accordance with the Patents (Companies Re-registration) Rules 1982, if a company
named in this certificate and any accompanying documents has re-registered under the
Companies Act 1980 with the same name as that with which it was registered immediately
before re-registration save for the substitution as, or the inclusion as, the last part of the
name of the words "public limited company" or their equivalents in Welsh, references to the
name of the company in this certificate and any accompanying documents shall be treated
as references to the name with which it is so re-registered.

In accordance with the rules, the words "public limited company" may be replaced by
p.l.c., plc, P.L.C. or PLC.

Re-registration under the Companies Act does not constitute a new legal entity but merely
subjects the company to certain additional company law rules.

Witness my hand this

Recombinant DNA Product and Processes Using it

The present invention relates to DNA sequences
which cause spontaneous high copy number
incorporation of vector DNA into a host cell and to
the uses thereof in recombinant DNA technology.

The main aim of workers in the field of
recombinant DNA technology is to achieve as high a
level of production as possible of a particular
polypeptide. This is particularly true of commercial
organisations who wish to exploit the use of
recombinant DNA technology to produce polypeptides
which naturally are not very abundant.

Generally, the application of DNA technology
involves the cloning of a gene encoding the desired
polypeptide, placing the cloned gene in a suitable
expression vector, transfecting a host cell line with
the vector, and culturing the transfected cell line
to produce the polypeptide. Optionally, the process
may include vector amplification stages in an attempt
to raise the production level. These steps are now
well known and for the most part can be operated
satisfactorily. However, there is still much
uncertainty as to how much polypeptide will in the
end be produced. It is almost impossible to predict
whether any particular vector or cell line or
combination thereof will lead to a useful level of
production.

There are in general two factors which 
significantly affect the amount of polypeptide 
produced by a transfected cell line. The first 
factor relates to the efficiency at which the cloned 
gene is transcribed and translated in the host cell. 
The present application is not primarily concerned 
with this factor. 

The second factor relates to the number of
copies of the gene present in the transfected cell.
If there is a large number of copies, an increased
level of production can be expected. There have been
a number of proposals for increasing copy number.
The most commonly used is generally known as vector
amplification, and is best exemplified by the DHFR
system.

DHFR is dihydrofolate reductase, an enzyme
which confers on a cell line the ability to grow in
the absence of nucleosides in the medium. In a
typical DHFR-based amplification system, a dhfr- cell
line is transfected with a gene encoding DHFR and a
gene encoding the desired polypeptide. The
transfected cell line is then grown in medium lacking
nucleosides. Cells which survive may contain both
the dhfr gene and the desired gene. Surviving cells
are then cultured in media containing increasing
concentrations of methotrexate (MTX), a compound
which binds to DHFR, thereby inhibiting its action.
The surviving cells have amplified levels of the DHFR
gene and concomitantly amplified levels of the gene
encoding the desired polypeptide.

While amplification systems have been
relatively successful in increasing copy number, they
are far from perfect, in that they require a number
of rounds of culturing, which is very time
consuming. There is therefore a need for a system
whereby the copy number of a desired gene in a
transfected cell line can be increased without the
need for laborious amplification procedures.

A further problem with presently-known vector
amplification systems is that initial transfectants
containing low copy numbers of the vector may not
produce sufficient product for detection. Thus,
identification of clones for subsequent amplification
may be difficult. There is therefore a need for a
system which enables transfected cell lines to be
identified more easily.

The present invention is based on the discovery
that the use of a particular vector to transfect a
cell line led to the production of a transfected cell
line having a surprisingly and unexpectedly high
vector copy number, without the need to carry out any
amplification procedures. It is nonetheless possible
to carry out amplification in addition, to further
increase vector copy number.

The vector which led to this discovery is the
vector pSVLGS.1. The structure of this plasmid is
shown in Figure 1. The plasmid is based on the
vector pCT54 (Emtage et al., PNAS-USA, 80, 3671-3675,
1983), and comprises the EcoRI-BamHI fragment
thereof. The remainder of the vector, going from the
EcoRI site to the BamHI site, comprises the SV40 late
region promoter and a minigene encoding glutamine
synthetase (GS). The GS minigene comprises the
complete coding sequence, a single intron and
approximately 2kb of 3'-flanking DNA spanning two
presumed sites of polyadenylation. The preparation
of this vector is described in detail in
International Patent Application No. PCT/GB87/00039.

The sequences in the pCT54 vector derived from
plasmid pBR322 and the SV40 late region promoter have
both been used in many systems, without giving rise
to any unexpected increase in copy number. Moreover,
other vectors have been produced which contain the GS
coding sequence, but not the intron, of the GS
minigene. Such vectors have been used for
transfection without achieving spontaneously high
vector copy number. It has now also been shown that
spontaneously high vector copy numbers can be
achieved using a vector lacking the PvuI-BamHI
fragment of pCT54. It is therefore believed that the
DNA sequence(s) in the vector which gives rise to the
surprisingly and unexpectedly high copy number is
located:






(i) in the intron;

(ii) in the 3' flanking DNA;

(iii) in a region bridging the SV40 portion and
the GS coding portion;

(iv) in a region bridging one or the other end
of the intron and the coding sequence;

(v) in a region bridging the coding sequence
and the 3' flanking DNA;

(vi) in the 5'-untranslated region derived from
the GS genomic DNA;

(vii) in the region of the 5'-untranslated
region which is a cloning artefact; or

(viii) in any combination of the above.

This DNA sequence(s) is herein termed a
"spontaneous high copy number sequence".

Almost the entire sequence of the vector 
pSVLGS.1 is known and is shown in Figure 2. The
areas which correspond to those regions set out as 
(i) to (vii) in the paragraph above are marked. Work 
is at present being carried out to identify which 
particular region(s) is responsible for the ability 
of the vector to transfect a cell line with high copy 
number and to elucidate the mechanism by which the 
high copy number is spontaneously obtained. These 
experiments will merely be a matter of routine for 
the man skilled in the art and will lead to the 
identification of the exact sequence of the region(s) 
in question. 

According to a first aspect of the present
invention, there is provided: the DNA sequence(s)
from the vector pSVLGS.1 which causes spontaneous
high copy number incorporation of vector DNA into a
host cell; or any DNA or RNA sequence which
hybridises thereto under high stringency conditions;
or any analog thereof. The spontaneous high copy
number sequences according to this aspect of the
present invention are hereinafter referred to as
SHCSs.

The inventors conjecture that the SHCSs may 
comprise a pair of small inverted repeats which 
enable the formation of an intracellular molecular 
intermediate in gene amplification comprising two 
complete vector sequences in inverted relationship. 

It has also been conjectured by the present
inventors that the SHCSs correspond to sequences
found around the break points in tandem arrays of
amplified genes or to the sequences of hypervariable
mini-satellite sequences. However, it is certainly
not the Applicants' desire to be limited in any way to
these conjectured rationalisations.

If the present inventors' conjectures are
correct then the SHCSs of the first aspect of the
present invention will include sequences derived from
the region around break points in amplified arrays of
genes and sequences derived from hypervariable
mini-satellite sequences. The SHCSs may also include
sequences found in repeated sequences of the mammalian
genome, such as the "Alu" repeats, which may form
sites at which recombination events can readily take
place. An alternative or additional hypothesis is
that the SHCSs comprise or include a mammalian origin
of replication.

According to a second aspect of the present 
invention, there are provided vectors, in particular 
expression vectors, containing an SHCS. Preferably, 
the vector contains two such sequences but in 
inverted relationship. Alternatively, there is 
provided a pair of similar vectors containing a 
single SHCS, in each of which the SHCS is in the 
opposite orientation to its orientation in the other 
vector. It is believed that the use of such a vector 
or pair of vectors will enable a large inverted 
duplication to arise by homologous recombination 
within the host cell and hence induce amplification.

Preferably, the vector includes not only the
SHCS but also a gene encoding a selectable marker.
Such a vector will be of use in transforming a cell
line in which the gene encoding the selectable marker
is either absent or not expressed. A particularly
suitable marker is GS. The use of GS as a selectable
marker permits the survival of only those transfected
cells which express a certain minimum level of GS
which permits resistance to a certain level of
methionine sulphoximine (MSX, a GS inhibitor). This
is in contrast to other selection procedures (e.g.
typically used selection procedures for DHFR or
guanine phosphoribosyl transferase (GPT) genes) in
which there is a less stringent requirement for
efficient expression of the selected gene. It has
been found that, using GS as a selectable marker, the
frequency at which transfectants are identified after
transfection with GS encoding genes is substantially
lower than the frequency obtained by using a DHFR,
GPT or neomycin-resistance gene as the selectable
marker, since only a sub-set of transfectants (i.e.
those which express higher than average levels of GS)
can survive. Thus the use of GS as a selectable
marker enables the selection of high copy number
transfectants without the need to carry out any
amplification stages. Nonetheless, the SHCSs could
also be used in combination with a DHFR encoding
sequence and using MTX as the selection agent.

Preferably, in expression vectors according to
the second aspect of the present invention, the
coding sequence is placed under the control of a very
strong promoter to direct expression of the coding
sequence. Advantageously, the promoter-containing
fragment also includes sequences which allow
efficient translation of the mRNA from the coding
sequence. In a particularly preferred embodiment,
the coding sequence is placed under the control of a
sequence comprising the promoter-enhancer and
complete 5' untranslated region of the major
immediate early gene of human cytomegalovirus
(hCMV-MIE).

Preferably, the CMV-derived sequence includes 
both the first splice donor and splice acceptor site 
of the MIE gene and a sequence similar or identical 
to a consensus translation "start" signal.

The present invention also includes host cells 
transfected with the vectors according to the 
invention and processes for the production of a 
desired polypeptide by culturing such transfected
cells.

According to a third aspect of the present
invention, there is provided the use of the hCMV-MIE
5' untranslated region linked directly to the coding
sequence for a desired polypeptide, for directing the
translation of mRNA. It is believed that this 5'
untranslated region is surprisingly efficient in
directing mRNA translation.

The present invention is now described, by way
of example only, with reference to the accompanying
drawings, in which:

Figure 1 shows the structure of the vector
pSVLGS.1;

Figure 2 shows the nucleotide sequence of the
GS sequences in the vector pSVLGS.1, from the 5'
EcoRI site to the HindIII site near the 3' end;

Figure 3 shows the structure of the vector
pHT.1;

Figure 4 shows the complete sequence of the
hCMV-MIE promoter enhancer region including the first
intron and a modified translation "start" site; and

Figure 5 shows the structure of the vector
pEE6HLC.HHC.GS.

Example 1

The construction of the vector pSVLGS.1 is
shown in International Patent Application No.
PCT/GB87/00039, a copy of which is enclosed with the
present application. The International Application
also shows the production of two other vectors
comprising a GS coding sequence. The other two
vectors are called pSV2.GS and pZIPGS. The structure
of the pSVLGS.1 vector is shown in Figure 1 of this
application and also in Figure 3A of the
International Application. It can be seen from
present Figure 1 that pSVLGS.1 includes a pCT54
derived section, the SV40 late promoter region, a GS
coding sequence, including a single intron, and about
2 kb of 3' flanking DNA.

The structures of pSV2.GS and pZIPGS are shown
in Figures 3B and 3C respectively of the
International Application. In these vectors, the GS
is encoded by a cDNA portion, lacking both the intron
and the 3' flanking DNA. Moreover, the GS coding
sequence is under the control of the SV40 early
region promoter.

The results given in the International
Application clearly show that the pSVLGS.1 vector is
incorporated in very high copy number into a host
cell merely on transfection, and in the absence of
amplification. The differences in the structures of
the pSVLGS.1 and the other two vectors, in particular
the pSV2.GS vector, lead to the conclusion that the
SHCS in the pSVLGS.1 vector is present in one or a
combination of the seven regions enumerated above.

Example 2

A series of experiments was carried out to
compare a number of vectors containing selectable
markers. The three selectable markers used were GS,
DHFR and GPT. In order to compare GS, DHFR and GPT
selection, a cDNA encoding tissue inhibitor of
metalloproteinase (TIMP) was used as a "reporter"
gene. TIMP expression levels were studied from cell
clones selected by each of the three methods.

The basic vector used in these experiments is
pHT.1 which is shown in Figure 3. In this vector,
the transcription unit used to direct TIMP expression
contains the hCMV-MIE promoter including its complete
5' untranslated region fused by means of a single
base change directly onto the NcoI site at the ATG
representing the first amino acid of the TIMP coding
sequence. This promoter-enhancer-leader fragment was
made by adding an oligomer which recreates the entire
5' untranslated sequence to the Pst-1m fragment of
hCMV. The complete sequence of the promoter-enhancer
region and 5' untranslated region including the first
intron and the modified translation "start" site is
shown in Figure 4. At the 3' end of the TIMP cDNA
fragment is the SV40 early polyadenylation signal.
At the 3' end of this transcription unit is a unique
BamHI site that was used to insert either i) the
PvuI-BamHI fragment of pSVLGS.1 (which contains the
GS minigene), ii) the PvuII-BamHI fragment from
pSV2.dhfr (which contains the dhfr cDNA) or iii) the
mouse metallothionein mMT-1 gene. Hence were derived
three vectors pHT.1GS, pHT.1DHFR and pHT.1MT. In
each case, both genes on the vector were in the same
orientation. pHT.1GS and pHT.1DHFR were transfected
into Chinese hamster ovary (CHO) K1 and dhfr- CHO
cells respectively and transfectants were selected
for i) resistance to 20 uM MSX (pHT.1GS) or ii) dhfr+
phenotype (pHT.1DHFR). pHT.1MT was co-transfected
with pEE6GPT (a vector containing the bacterial
xanthine-guanine phosphoribosyl transferase (gpt)
gene) into CHO-K1 cells and transfectants were
selected for resistance to 5ug/ml mycophenolic acid
in the presence of xanthine, hypoxanthine and
thymidine. 24 colonies were isolated from each of
the three transfections. These were grown up and
assayed for TIMP production. The results obtained
are shown in Table 1.

TABLE 1

Vector          No of cell lines secreting TIMP

pHT.1GS         17/24
pHT.1DHFR       17/24
pHT.1mMT        9/24

Several clones secreting the highest levels of
TIMP from each transfection were grown to equivalent
cell densities and TIMP secretion was assayed. The
results are shown in Table 2.



TABLE 2

Clones:
GS Timp 10
GS Timp 14
GS Timp 15
GS Timp 19
dhfr Timp 1
dhfr Timp 3
dhfr Timp 4
dhfr Timp 5
mMT Timp 1
mMT Timp 2
mMT Timp 3

TIMP secretion values (alignment of values to clones lost in the source):
4.0, 3.0, 0.7, 0.58, 4.8, 1.5, 0.8, 8.5, 9.0, 5.5, 5.5


The GS TIMP clones produced 2-3 times more TIMP 
than mMT TIMP clones, and about 10 times more than 
dhfr TIMP clones. 

The cell line secreting the most TIMP from each 
transfection was cloned by limiting dilution and the 
specific production rates for the best clone in each 
case were as follows:- 

GS TIMP 19.12     10ug/10^6 cells/24 hours

dhfr TIMP 3.6     0.75ug/10^6 cells/24 hours

mMT TIMP 5.8      4ug/10^6 cells/24 hours
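The comparison between the three best clones reduces to simple ratios, sketched below. The sketch assumes all three specific production rates are expressed per 10^6 cells per 24 hours (the source extraction is ambiguous for the mMT value), and the dictionary and variable names are illustrative only.

```python
# Specific production rates of the best clone from each selection method,
# in ug of TIMP per 10^6 cells per 24 hours (units assumed identical for all three).
rates = {
    "GS TIMP 19.12": 10.0,
    "dhfr TIMP 3.6": 0.75,
    "mMT TIMP 5.8": 4.0,
}

gs_rate = rates["GS TIMP 19.12"]

# Ratio of the GS clone's rate to each of the other two clones.
gs_vs_mmt = gs_rate / rates["mMT TIMP 5.8"]
gs_vs_dhfr = gs_rate / rates["dhfr TIMP 3.6"]

print(f"GS / mMT  = {gs_vs_mmt:.1f}")
print(f"GS / dhfr = {gs_vs_dhfr:.1f}")
```

On these figures the GS clone out-produces the mMT clone by a factor between 2 and 3 and the dhfr clone by roughly an order of magnitude, in line with the comparison stated for the uncloned lines.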

Selection using a vector including the SHCS
derived from the GS minigene and GS as a selectable
marker allowed the identification of clones producing
substantially higher levels of TIMP than were
obtained using either of two alternative selectable
markers, DHFR or GPT.

Clones from the cell line GS TIMP 19. could also 
be selected for gene amplification by culturing in 
500uM MSX and a cell line was obtained which secreted
100ug/10^6 cells/24 hours.

Example 3

An expression vector designed specifically for 
the expression of immunoglobulin genes in CHO cells 
was constructed. The structure of the expression 
vector, which is called pEE6HLC.HHC.GS, is shown in 
Figure 5. It contains the following DNA sequences: 
immunoglobulin light and heavy chain genes under the 
control of the human cytomegalovirus immediate early 
gene promoter and SV40 early gene polyadenylation 
signal; the GS minigene from pSVLGS.1 under the
control of the SV40 late gene promoter; a bacterial
origin of replication; and the ampicillin resistance
gene. Following the introduction of plasmid DNA into
CHO cells by calcium phosphate co-precipitation,
colonies were isolated which were resistant to 20uM
MSX. These cell lines were subjected to a further
selection in 200uM MSX. Rates of antibody secretion
were measured for each cell line and gene copy number 
was estimated from Southern blots of genomic DNA. 
The results are shown in Table 3. 

The initial transfected cell lines have a copy 
number of at least fifty per cell and after a single 
round of selection for GS gene amplification this 
copy number increased to approximately five hundred. 
This increase in copy number is accompanied by a 
00-20 fold increase in the rate of antibody secretion. 

TABLE 3

Cell                            Specific Production      No. of vector
Line      Characteristics       Rates (ug/10^6 cells/    copies/cell
                                24 hours)

36        Unamplified pool      0.077                    50
36.5      Unamplified clone     0.125                    50
36.1 i    Amplified pool        2.9                      500
36.1 ii   Amplified pool        0.075                    100
36.1 I    Amplified clone       3.19                     500
36.1 J    Amplified clone       1.28                     50

36.1 ii is the amplified pool after culturing for 2
months in the absence of MSX.
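The relationship between copy number and secretion rate in Table 3 can be worked through for the two pool rows. The sketch below uses only the values for the unamplified pool (36) and the amplified pool (36.1 i) as read from the table. The mass unit of the rate is an assumption, since it is partly lost in the source, but the ratios do not depend on it. All names are illustrative.

```python
# (cell line, specific production rate per 10^6 cells per 24 hours, vector copies per cell)
unamplified_pool = ("36", 0.077, 50)
amplified_pool = ("36.1 i", 2.9, 500)

_, rate_before, copies_before = unamplified_pool
_, rate_after, copies_after = amplified_pool

# Fold changes across the single round of selection for GS gene amplification.
fold_copies = copies_after / copies_before
fold_rate = rate_after / rate_before

# Secretion rate normalised to vector copy number, before and after amplification.
per_copy_before = rate_before / copies_before
per_copy_after = rate_after / copies_after

print(f"copy number increased {fold_copies:.0f}-fold")
print(f"secretion rate increased {fold_rate:.1f}-fold")
print(f"rate per copy: {per_copy_before:.5f} -> {per_copy_after:.5f}")
```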
Example 4 

In order to test whether the SHCS in the GS
minigene can be used to obtain transfected clones
expressing a linked gene more efficiently than
vectors lacking the GS minigene, the following
vectors were introduced into a dhfr- CHO cell line:
1) pSV2.dhfr, which is a widely used selectable
marker in this cell line, conferring the ability to
grow without added nucleosides; 2) pSV2dhfrGS3,
which contains the GS minigene (specifically the
sequence between the PvuI and BamHI sites of
pSVLGS.1) inserted at the BamHI site of pSV2.dhfr
such that the GS and DHFR genes are in opposite
orientations; 3) pSV2dhfr.GS6, which is identical to
pSV2dhfr.GS3 except that the DHFR and GS genes are in
the same orientation; and 4) pSV2dhfr.ne13, which
contains a gene which confers resistance to the
antibiotic G418 inserted at the BamHI site of
pSV2.dhfr such that the two genes are in opposite
orientations. 9cm petri dishes containing at least
10^6 cells were transfected with (i) 5ug or (ii) 10ug
of each vector by calcium phosphate co-precipitation
and the cells were allowed to recover in a
non-selective medium. After 2 days, the medium was
replaced by DMEM medium containing 10% dialysed FCS
and 150ug/ml proline, to select for dhfr+
transformants, or G418 in Ham's F12 medium to select
for expression of the ne gene. To some dishes,
methotrexate was also added to serve as an assay for
the amount of DHFR enzyme produced. Nine days after
transfection, the number of colonies on each plate
was scored.

The results from two independent transfections
are shown in Table 4. For transfection (i), 5ug of
each plasmid was introduced into 10^6 cells on each
dish. In transfection (ii) 10ug of DNA was used and
the number of cells per dish was 4 x 10^6.

The concentrations of plasmid DNAs were
carefully measured by absorbance prior to
transfection.






TABLE 4

No of surviving colonies/10^6 cells

Columns: Plasmid; MTX 0nM; MTX 50nM (%*); MTX 100nM (%)

Plasmid rows: pSV2dhfr (i), (ii); pSV2dhfrGS3 (i), (ii);
pSV2dhfrGS6 (i), (ii); pSV2dhfrne13 (i), (ii)

0nM: 61, 240, 65, 1400, 90, 1800, 26, 152

50nM and 100nM entries (column alignment lost in the source):
13, 58, 90; (1.3) 1, 0; (4); (5), (-); 5, 0; (1.6), (-), (5.5), (-)

pSV2dhfrne13 (i), G418 (0.8mg/ml): 77

*%: the percentage of the total dhfr+ transfectants
which are resistant to a given level of MTX.
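The percentage defined in the footnote is the number of colonies surviving at a given MTX concentration divided by the number of dhfr+ colonies at 0nM MTX. A minimal sketch of that calculation follows. The function name is illustrative, and the counts in the usage lines are hypothetical inputs chosen only to show the arithmetic, not a reading of any particular row of Table 4.

```python
from typing import Optional


def percent_resistant(resistant_colonies: int, total_dhfr_plus: int) -> Optional[float]:
    """Percentage of dhfr+ transfectants that survive a given MTX level.

    Returns None when there are no dhfr+ colonies, since the percentage
    is then undefined (shown as "(-)" in a table).
    """
    if total_dhfr_plus == 0:
        return None
    return 100.0 * resistant_colonies / total_dhfr_plus


# Hypothetical counts for illustration only.
example = percent_resistant(5, 90)
undefined = percent_resistant(0, 0)

print(example, undefined)
```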

The results show that the presence on the
vector of a GS-minigene leads to the survival of a
greater number of dhfr+ transfectants and a greater
proportion of these are resistant to high levels of
MTX (50nM or 100nM) than if a vector lacking the
GS-minigene is used.

This effect is observed only when the
GS-minigene is in the same orientation in the vector
as the DHFR gene, probably because convergent
transcription when the genes are in opposite
orientations is inhibitory for mRNA synthesis. This
can also explain why the introduction of an
irrelevant gene, the ne gene in pSV2dhfrne13, in the
opposite orientation leads to the survival of fewer
dhfr+ colonies than when pSV2dhfr is used for
transfection.

The stimulating effect is not observed when a ne
gene is inserted into pSV2dhfr, indicating that the
enhanced expression of DHFR is a specific effect of a
sequence(s) in the GS-minigene. Since an equal
weight of each vector DNA was used for transfection
and because pSV2dhfrGS6 is approximately twice the
size of pSV2dhfr the stimulating effect is in fact
observed even when the number of introduced molecules
is only about half the number of pSV2dhfr molecules.
It can thus be seen that the use of the SHCS from
the GS minigene leads to the production of
transfectants having surprisingly and unexpectedly
high copy number. The advantages of the use of such
SHCSs in recombinant DNA technology are readily
apparent.

[Figure: double-stranded nucleotide sequence listing with position markers running from approximately 7900 to 9180; the sequence is not legible in the source.]
CTAGAACTCG GAAAGGATCA AG7ACGG7CG GGCGCCG7CG ACACAGTAAC ATTGAGTTTC 

9 , go 5X00 ^210 ^220 9230 

CATCCAA7A7 CAACGGTCTT TTTATTCCTC GTGCCCAGTT AATCCTTGCT TTTATTCCTC 
CTACCTTATA GTTCCSAGAA AAATAAGGAG CACCGGTCAA r.'AGCAACGA nAATAACCAG 

3250 9?SG *270 928C 3290 3300 

HGAATAGAGG AGTCA^.GTTC TTAATGC1CTA 7ACACCAAC< IC AT: TCTT7 TC7AT PVAGC 
7C77A7CTCC TCAG1TCAAG AATTACGGAT ATGTGCTTGG AGTAAAGAAA AGATAAATC*.. 



9310 9«?0 S330 

77TC7ACC7C GGGG'l GGGAG GGGTAGGGAG 
AAAGA7GCAG CCCCArXCTC CCCATCCCTC 



9340 33ti0 
GCGTAGCCGA AGGCAACCTA ACCACATGCT 
CCCATGCCC-T MCCTTGCAY TCGTGTACGA 



?4iO 
T77ACACCCT 
AAATCTCCCA 



ACACATCC7A 
TCTCT'-CCAT 



55 JO 
GCCTAG^TTA 
COCATGCAAT 



AACAGAAACT 
TTCTCTTTCA 



TATTTCTCv:? 
*\7AAAGACCA 



CCTCCAGT7A 
GOAGGTCAAV 



9550 3550 953° 9^3 3600 

TAACAC AACC AGATC'ATATT TTATA T7TAA AT CTAAAmA<- 'AAAACTTATA TA7ATCATAT 
ATTCTGTTCC TCTACTATAA AATATAAATT TACATTTTTC TTTTCAATAi A TA I A«. i ATA 



3610 *5iSiO S630 

G7CGATATAT GTGTATT7CT AATTCA^AAA 
CACCTATATA C ACAT-.AAGA TTAA0TCT77 



3540 i. 96-iO 
CCATCCTAGT TACTGGCTTT CCCAAGTTTG 
CGTAGGATCA '->.7GACCCAAA CCGTTCAAAC 



9670^?^ 9£80 **90 
AACAGCTTCQ_UjAAC/";A r .;AA . AGGATC7CTT 
TTCTCGAACC AA7TGTTCTT TCCTAGAGAA 

cir * — I 

9730 r ^74Q 37?>0 

J TTATCT ggggctcagc cctttattac 

CACCAATAGA CCCl.:G'-'-.GTG(i CGaAATAATG 

9790 9S00 53 10 

fiGCAGATGCT GGACA7.GTAG CCAGGG7GGG 
TCCTCTACCA CCTCTCCATC GGTCCCACCC 



9S50 r -3fc0 SS70 

C7TAGCCCTA AGATGCA7AT CTATCCACAC 
GAATCCC-GAT TC TAG "7 AT A CA7AGG7CTG 



*970C 9710 !3720&t 

GAG TAGACGT SGGGGTGCrtO 1 AGGAGGAAA 
CTCATGTCCA CCCCCACGTC ATGCTCCTTT 



3?r0 97 9780 

TATGTGGGGT TTCCCTCCCf. ACTXICCAGG 

ATACACCrr A MAGOG AC GOG TG AGACGTCC 

t ,5 <: t — > cioi 

982C 985i0 9340 • 

ACACAGTGCT v'GCCACCACC TG7CCC7GTC 
TCTCTCACCA AUCCTGCTGG ACAGCGACAC 

9A50 ^ft5'j f<9Q0 

ACACTTAGCA GGATGCAGTT CGCTGGZHA& 
TCTGAA7GC-T SCTACCTChA 'jCGAGGAGTT 



9910 9920 9930 99-G 9930 9960 

CTTCAACATT TC7TAC7GAT AGGCGTGGTG GGTT7ATT7T j'TGGTGCC «T nGCATGTCAC 
GAAC7TGTAA ACAATGACTA TCCCGACCAC CCAAATAAAA AACCACCGTA TCGTACAGTG 

9970 *i98f« 9990 ^$10000 jOO-0 J 3020 

A-AAAGCAGC CCTTTGATAT ATTAAATTTT TTTAAAC CAA ACATGTTCAG CTTTATCACC 
TATTTCCTGC GGAAA::7ATA TAATTTAAAA AAATTTGC7T ""-TAGAAGTC GAAATrtGTGG 
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AACATCCCAA ACATCAA -- -- - " 

1003 0 1C1O0 J CI 10 lUl/'O i 0 J 30 J 0140 






1C.460 .'.;!470 3 0 <=••=• 0 :0450 1 C> T.. C> 0 

•CCATC TACGATATCT TCGATTAGA7 AACTCCTCAT OTAAACAGAC 
"GTAC ATGCTA7ACA A«:CTAAT-r:?A r?f;ACC,AGTA (;ATTTG7<;TG 



10510 105520 10530 i 0'5-*C -.0550 10560 

r ''AACTGCC AGAGCiC-i'GC TTATAAATOA ACCTAACATT I ATAAGATTT <;C7CT PGACT 
AlATTGACGG tctcctcccg AATATTTAGT tggattgtaa A7AT7CTAAA ggagaactga 



j 05 "0 1 :-.5>flO j -590 1 OPC: 0 i OS ; 0 1 D620 

TC-TTTCTTTC 70G7'l GGGGG ACCAACAAAA AAAAAACTGC GATAT7T7T7 TGTTCCTTCA 
flOAAAGAAAC ACCA-rXOCC TGGTTCTTTT TTTT7TCACG ::7AYAAAAftA ACAACCAAGT 



10530 10B40 10SSO IOdSO i06/'O ttu^ 1 ^- 4 

'7TCC7A7CA AftAC-A-";AGGC CAG7GG77CT CTT'lTi/flTA r?t';GC:AAAi\V rtAGCTT 



'. AAAGGATAG7 TTTCTTTCCC CTCACCAAGA CAAAACAAA7 GAGCGTTTTA TTCGAA 
L -LC-T 



i 1 n n i. shed 



T ; 

DNS P t AN 

SL't i h ri 

aoa t 3 lu 

HI 12 31 

// 

CCATGGTGTCAAGGACGGTGACTGCAGTGAAThhTAAAATGTGTGTTTGTCCGAAhTACG 
! + + + + f + 6C 

GGTACCACAGT7CCTGCCACTGACGTCACTTATTATTTTACACACAAACAGGCTT7ATGC 

CGTTTTGAGATTTCTGTCGCCGACTAAATTCAVGTCGCGCGATAGTGGTGTTTATCGCCG 
6i + + + + + + 120 

GCAAAACTCTAAAGACAGCGGCTGATTTAAGTACAGCGCGC7ATCACCACAAATAGCGGC 

C 
1 

. 3 
1 

ATAGAGATGGCGATATTGGAAAAATCGATATTTGAAmhTATGGCATATTGAAAATGTCGC 

121 + + + + + + ISO 

7A7CTCTACCGCTA7AACCTT77TAGCTA7AAAC7T7 7ATACCGTATAAC7TT7ACAGCG 

E 
c 
o 
R 
V 

CGATGTGAGTTTCTGTGTAACTGATATCGCCATTTTTCCAAAAGTGATTTTTGGGCATAC 
-81 - + . + + + + i-f o^O 

GCTACACTCAAAGA.CACATTGACTATAGCGGTAAAAAGGTTTTCACTAAAAACCCGTATG 

E 
c 

o 

R 
V 

GCGATATCTGGCGATAGC66CTTATATCGTTTACGGGGGATGGCGATAGACGACTTTGGT 
241 + f + + + * 300 

CGCTATAGACCGCTmTCGCCGAATATAGCAAATGCCCCCTACCGCTATCTGCTGAAACCA 

GACTTGGGCGA77C7GTGTGTCGCAAATA7CGCAG777CGA7A7AGGTGACAGACGA7AT 
301 — + + + + 4 + 260 

CTGAACCCGCTAAGACACACAGCGTTTATAGCGTCAAAGCTATATCCACTGTtiTGCTATA 

C HH N C 

f aa s I 

r ie is 
1 11 11 
/ 

GAGGCTATATCuCCGATAGAGGCGACATCAAGCTGGCACATGGCCAATGCATATCGATCT 

361 t + - + + + + 420 

CTCCGATATAGCGGC TATCTCCGCTGTmGTTCGACCGTGThCCGGTT ACGTATAGCTAGft 
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S C SK 
y f aa 
p r le 

i in 

ATACATTGAATCAATATTGGCCATTAGCCATATTATTCATTGGT7ATA7 AGCATAAATCA 
421 + + + + ■<• + 430 

TATCTAACTThGTTATAACC6GTAATCGGTATAATAAGTAACCAhTATATCGTATTTAGT 

< 

S C EH 

S f 33 

p r le 

1 1 11 

/ 

ATATTGGCTATTGGCCATTGCATACGTTGTATCCATATCATAATATGTACATTTATATTG 
4S1 + + + + + 540 

TATAACCGATAACCGGTAACGTATGCAACAThGGTATAGTATTATmCATGTAhATATAAC 

K 

1 H S 
n Hi p 
c e e 

2 1 1. 
GCTCATGTCCAACATTACCGCCATGTTGACATTGATTATTGACTAGTTATTAATAGTAAT 

541 + + + f — + • -r+ 600 

CGAGTACAGGTTGTAATGGCGGTACAACTGTAACTAATAACTGATCAATAATTATCATTA 

CAATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGG 
oOl + 7 + + + + + 6&o 

GTTAATGCCCCAG'T.AATCAAGTATCGGGTATATACCTCAAGGCGCAATGTATTGAATGCC 

B A A 

£ ha 
1 -at 
1 2 2 

TAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAArAATGACGT 

661 + f + + t + 720 

ATTTACCGGGCGGACCGACTGGCGGGTTGCTGGGGGCGGGTAACTGCAGTTATTACTGCA 

A A 
h a 
a t 
2 2 

ATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGThTTTAC 

721 — + +— + + + + 7S0 

TACAAGGuTATCATTGCGGTTATCCCTGAAAGGTAACTGCAGTTACCCACCTCATAAATG 

B H 
si d 
1 e 
1 • 1 
GGTmhhCTGCCCACTTGDCAoTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTG 
731 + + - + + + — + 840 

CCATTTGACGGGTGAACCGTCATGTAGTTCACATAGTATACGGTTCATGCuGSGomTAAC 
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ACGTCAATGACGGTriHATGGCCCGCCTGGCATTATGCCCAGTACATQACCTTAT6GGACT 
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TGCAGTTACTGCCATTTACCGGGCGGACCGTAATACGGGTCAT6TACTGGAATACCCTGA 
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TTCCTACTTGGC AGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTT 
901 + + + + + + 960 

AAGGATGAACCGTCATGTAGATGCATAATCAGTAGCGATAATGGTACCACTACGCCAAAA 



GGChGTACATCAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACC 
961 + + + + + ~ - + 

CCGTCATGTAGTTACCCGCACCTATCGCCAAACTGAGTGCCCCTAAAGGTTCAGAGGTGG 



1020 
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3 
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t 
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n 
1 



CChTTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTC 
2i + + + + + + 1030 

GGTAACTGCAGTTACCCTCAAACAAAACCGTGGTTTTAGTTGCCCTGAAAGGTTTThCAG 



GTAACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATA 

10S1 + + +. + + + 

CATTGTTGAGGCGGGGTAACTGCGTTTACCCGCCATCCGCACATGCCACCCTCCAGATAT 



1140 
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FssfS G A 

sp i 3 s h 

n 1 A c us 

2211 1 2 

/// 

TAAGCAGAGCTCGTTTAGTGAACCGTCAGATCGCCTGGAGACGCCATCCACGCTGTTTTG 

1 + + + + __; + 

ATTC6TCTCGAGCAAATCACTTGGCAGTCTAGCGGACCTCTGCG6TAGGTGCGACAAAAC 



1200 



H 

B D BCGsSX 

b s Sfdfsm 

v a IriBca 

2 1 112223 

//// 

hCCTCCATAGAAGACACCGGGACCGATCCAGCCTCCGCGGCCGGGmACGGTGCATTGGAA 

1201 T T + T + + 

TGGAGGTA7CTTCTGTGGCCCTGGCTAGGTCGGAGGCGCCGGCCCTTGCCACGTAACCTT 



1260 
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»■■' , -* • - - 

CGCGGATTCCCCGTGCCmhGAGTGACGTAAGTACCGCCTATAGmGTCThTmGGCCCACCC 
1? > x + + + + __ + 132() 

GCGCCTAAGGGGCACGGTTCTCACTGCATTCATGGCGGATATCTCAGAThTCCGGGTGGG 

B N 

Ss N sS 

tt s pr> 

yX i Hh 

11 1 11 

/ 

CCTTGGC7TCTTATGCATGCTATACTGTTTTTGGCTTGGGGTCTATACACCCCCGCTTCC 
1321 t + + + + f 1330 

G6AhCCGAAGhATACGThC6ATATGACAAAAACCGAA.CCCCAG^TATGTGGGGGCGAAGG 

E 

s ' 
p 

1 

TCATGTTATAGGTGhTGGTATAGCTTAGCCTATAGGTGTGGGTTATTGACCATTATTGAC 
1331 + + + : + + 1440 

AGTACAATATCCACTACCATATCGAATCGGATATCCACACCCAATAACT6GTAATAACTG 

P 
f 

" ' 1 • • 

M 

. • • 1 

CAC7CCCCTATTGGTGACGATACTTTCCATTACTAATCCATAACATGGCTCTTTGCCACA 

li41 + . + + 1 + + 1500 

GTGAGGGGATA A.CCACTGCTATGA AAGGTAATGATTAGGTATTGTACCGAG AAACGGTGT 

E - • ... . 

c 

o 

5 

7 

ACTCTCTTTATTGGCTATATGCCAATACACTGTCCTTCAGAGACTGACACGGACTCTGTA 

1501 + + + + — + + 1560 

TGAGAGAAATAACCGATATACGGTTATGTGACAGGAA6TCTCTGACTGTGCCTGAGACAT 

E 
C 
o 
3 
1 

TTTTTACAGGmTGGGGTCTCATTTATTATTTACAAATTCACATATACAACACCACCGTCC 

15ol f + + + f + 1620 

AAAAAToTCCTACCCCAGAGTAAATAATAAATGTTTAAGTGTATATGTTGTGGTGGCAGG 

B 

. s X A A 

p ' h v f 

1 o .3 1 

2 2 13 
CCAGTGCCCGCmGTTTTTATTAAmChTAACGTGGGATCTCCACGCGAATCTCGGGTACGT 

1621 t + + -f t + 16S0 

GGTCACGGGCGTCAAAAATAATTTGThTTGCACCCThGAGGTGCGCTTAGAGCCCATGCA 
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* & g 

*• sp a ,.. 

M n 1 n i 

2 22 f-> 

GTTCCGGHCAT6GGC7CTTCTCC6GTAGCfiGCGGAGCTTCTACATCCGAGCCCTGCTCCC 
1681 -r x + + - — + 4 i7 4 o 

CAAGGCCTGTACCCGAGAAGAGGCCATCGCCGCCTCGAAGATGTAGGCTCGGG.ACGAG6G 

G H 

3 

u e 
1 1 
ATGCCTCCAGCGACTCATGGTCGCTCGGCAGCTCC TTGCTCCTAACAGTGGAGGCCAGAC 
1741 f + + +— + + 1300 

TACGGAGGTCGCTGAGTACCAGCGAGCCGTCGAGGAACGAGGATTGTCACCTCCGGTCTG 

If 
s 

3 

1 

ttaggcacagcacgatgcccaccaccaccagtgtgccgcacaaggccgtggcggtagggt 
isoi + + + + + — __ + 1860 

aatccgtgtcgtgctacgggt3gtggtggtcacacggcgtgttccggcaccgccatccca' ' 

BH N 

ABsdS s . A B 

vapis p f b 

■ sni Ac B 1 v 

12211 2 2 2 

/// ... ... 

ATGTGTCTGAAAATGAGCTCGGG6AGCGGGCTT6CACCGCTGACGCATTTGGAAGACTTA 
1361 + + + + __ + +1920 

TACACAGACTTTTACTCGAGCCCCTCGCCCGAACGTGGCGACTGCGTAAACCTTCTGAAT 

N . n 

s sP 

f PV 

B Bu 

2 "3 *? 

/ 

AGGCAGCGGCAGAAGAAGATGCAGGCAGCTGAGTTGTTGTGTTCTGATAAGAGTCAGAGG 
1921 + + + + + _ + 1930 

TCCGTCGCCGTCTTCTTCTACGTCCGTCGACTCAACAACACAAGACTATTCTCAGTCTCC 

H 

iH S 

np e 
ca " 3 



1 1 



/ 

TAhCTCCCGTTGCGGTGCTGTTAACGGTGGAGGGCAGTGTAGTCTGAGCAGTACTCGTTG 
1981 + t--- + + + + 2040 

ATT3AGG6CAACGCCACGACAATTGCCACCTCCCGTCACATCAGACTCGTCATGAGCAAC 
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s s DNS 
s a set 

H n 5 0« 

2 2 111 

// 

CTGCCGCGCGCGCCACCAGACA7AA7AGCTGACAGACTAACAGAC7G77CC7T7CCATGG 

2041 + + — + + + .-+ 2100 

GACGGCGCGCGCGGTGGTCTGTATTATCGhCTGTCTGATTGTCTGACAAGGAA.AGGTACC 

E 

P HNS .Bs 

S SCt 3.* 

t aoy nl 

1 111 22 

// / 
GTCTTTTCTGCAGTCACCGTCCTTGACACCATGGCCCCCTTTGAGCCCCTGGCTTCTGGC 
2101 + + + + + + 2160 

CAGAAAAGACGTCAGTGGCAGGAACTGTGGTACCGGGGGAAACTCGGGGACCGAAGACCG 

D 
p 

a 

ATCCTGTTGTTGCTGTGGCTGATAGCCCCCAGCAGGGCCTGCACCTGTGTCCCACCCCAC 

2161 + + + + + + 2220 

TAGGACAACAACGACACCGACTATCGGGGGTCGTCCCGGACGTGGACACAGGGTGGGGTG 

T B 
t Ms 
h mt . . 

3 • eX • ... • : . 

1 11 
CCACAGACGGCCTTCTGCAATTCCGACCTCGTCATCAGGGCCAAGTTCGTGGGGACACCA 

2221 + + + + + + 2280 

GGTGTCTGCCGGAAGACGTTAAGGCTGGAGCAGTAGTCCCGGTTCAAGCACCCCTGTGGT 

K 
i 
n 

c 

GAAGTCAACCAGACCACCTTATACCAGCGTTATGAGATCAAGATGACCAAGATGTATAAA 

2281 + + + + + + 2340 

CTTCAGTTGGTCTGGTGGAATATGGTCGCAATACTCTAGTTCTACTGGTTCTACATATTT 

N 

rl • s A DNS 

s f c set 

t- B c aoy 

2 2 1 111 

// 

GGGTTCCAAGCCTTAGGGGATGCCGCTGACATCCGGTTCGTCTACACCCCCGCCATGGAG 

2341 + + + + + + 2400 

CCCAAGGTTCGGAATCCCCTACGGCGACTGTAGGCCAAGCAGATGTGGGGGCGGTACCTC 
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i-ij 
sM 
21 
/ 

AGTGTCTGCGGATACTTCCACAGGTCCCACAACCGCAGCGAGGAGTTTCTCATTGCTGGA 

2401 + + + + + + 2460 

TCACAGACGCCTATGAAGGTGTCCAGGGTGTTGGCGTCGCTCCTCAAAGAGTAACGACCT 

E 

P , P s 

S S P 

t t H 

1 11 
AAACTGCAGGATGGACTCTTGCACATCACTACCTGCAGTTTCGTGGCTCCCTGGAACAGC 

2461 + + + + + + 2520 

TTTGACGTCCTACCTGAGAACGTGTAGTGATGGACGTCAAAGCACCGAGGGACCTTGTCG 

EE H 

p p e 
11 2 

CTGAGCTTAGCTCAGCGCCGGGGCTTCACCAAGACCTACACTGTTGGCTGTGAGGAATGC 

2521 + + + + + + 2580 

GACTCGAATCGAGTCGCGGCCCCGAAGTGGTTCTGGATGTGACAACCGACACTCCTTACG 

T 

B P . t 

s s h 

m t 3 

1 1 2 

ACAGTGTTTCCCTGTTTATCCATCCCCTGCAAACTGCAGAGTGGCACTCATTGCTTGTGG 
TGTCACAAAGGGACAAATAGGTAGGGGACGTTTGACGTCTCACCGTGAGTAACGAACACC 

s 
t 

y 
1 

ACGGACCAGCTCCTCCAAGGCTCTGAAAAGGGCTTCCAGTCCCGTCACCTTGCCTGCCTG 

2641 + + + + f -+ 2700 

TGCCTGGTCGAGGAGGTTCCGAGACTTTTCCCGAAGGTCAGGGCAGTGGAACGGACGGAC 

A BH B 
A p SS s 

V 3 PI P 

3 L 1 A M 

1 1 21 - 2 

/ 

CCTCGGGAGCCAGGGCTGTGCACCTGGCAGTCCCTGCGGTCCCAGATAGCCTGAATCCGG 

2701 + + + + + + 2760 

GGAGCCCTCGGTCCCGACACGTGGACCGTCAGGGACGCCAGGGTCTATCGGACTTAGGCC 



• <LKJ 1 1 uu • ^.u - -to 



uc 11 e v a 



r 

• 1 

ATCATAATCAGCCATACCACATTTGTAGAGGTTTTACTTGCTTTAAAAAACCTCCCACAC 
2761 + + + + + + 2820 

TAGTATTAGTCGGTATGGTGTAAACATCTCCAAAATGAACGAAATTTTTTGGAGGGTGTG 



H 

B iH 
a' n»» 
m cs 
1 21 
/ 

CTCCCCCTGAACCTGAAACATAAAATGAATGCAATTGTTGTTGTTAACTTGTTTATTGCA 

2821 + + + + + + 2880 

GAGGGGGACTTGGACTTTGTATTTTACTTACGTTAACAACAACAATTGAACAAATAACGT 

GCTTATAATGGTTACAAATAAAGCAATAGCATCACAAATTTCACAAATAAAGCATTTTTT 
2881 + + + + + + 2940 

CGAATATTACCAATGTTTATTTCGTTATCGTAGTGTTTAAAGTGTTTATTTCGTAAAAAA 

B 

B sX 
s mh 
«' Ho 
1 12 

/ 

TCACTGCATTCTAGTTGTGGTTTGTCCAAACTCATCAATGTATCTTATCATGTCTGGATC 

2941 + + + + + + 3000 

AGTGACGTAAGATCAACACCAAACAGGTTTGAGTAGTTACATAGAATAGTACAGACCTAG 

C 

3001 - 3001 
G 

Enzymes that do cut: 

Aat2   Acc1   Afl2   Afl3   Aha2   ApaL1  Ava1   Bal1   BamH1  Ban1 
Ban2   Bbv2   Bgl1   Bsm1   Bsp12  BspM1  BspM2  BssH2  BstX1  Cfr1 
Cla1   Dra1   Dra2   Dsa1   Eco31  Eco57  EcoRV  Esp1   Gdi2   Gsu1 
Hae1   Hae2   HgiA1  Hinc2  Hpa1   Mlu1   Mme1   Mst2   Nco1   Nde1 
Nsi1   NspB2  NspH1  PflM1  PpuM1  Pst1   Pvu2   Sac1   Sac2   Sca1 
SnaB1  Spe1   Sph1   Ssp1   Sty1   Tth31  Tth32  Xho2   Xma3 

Enzymes that do not cut: 

Apa1   Asu2   Avr2   Bcl1   Bgl2   BspH1  BstE2  Cfr10  Dra3   EcoB   EcoK 
EcoR1  Fsp1   HgiE2  Hind3  Kpn1   Nae1   Nar1   Nhe1   Not1   Nru1 
Pvu1   Rsr2   Sal1   Sfi1   Sma1   Spl1   Stu1   Xba1   Xho1   Xmn1 



T 

DNS D f 

, r . P t AM 

sct s h fl 

a ° Y t 3 1,, 

// 31 

CCATGGTGTCAAGGACGGTGACTGCAGTGAATAATAAAATGTGTGTTTGTCCGAAATACG 
1 + + + + + 

GGTACCACAGTTCCTGCCACTGACGTCACTTATTATTTTACACACAAACAGGCTTTATGC 

CGTTTTGAGATTTCTGTCGCCGACTAAATTCATGTCGCGCGATAGTGGTGTTTATCGCCG 
61 + + + + + 

GCAAAACTCTAAAGACAGCGGCTGATTTAAGTACAGCGCGCTATCACCACAAATAGCGGC 

C 
1 
a 
1 

ATAGAGATGGCGATATTGGAAAAATCGATATTTGAAAATATGGCATATTGAAAATGTCGC 
121 + + + + + 

TATCTCTACCGCTATAACCTTTTTAGCTATAAACTTTTATACCGTATAACTTTTACAGCG 

E 
c 
o 
R 
V 

CGATGTGAGTTTCTGTGTAACTGATATCGCCATTTTTCCAAAAGTGATTTTTGGGCATAC 
181 + + + + + + 

GCTACACTCAAAGACACATTGACTATAGCGGTAAAAAGGTTTTCACTAAAAACCCGTATG 

E 
c 
o 
R 
V 

GCGATATCTGGCGATAGCGGCTTATATCGTTTACGGGGGATGGCGATAGACGACTTTGGT 
241 + + + + + + 300 

CGCTATAGACCGCTATCGCCGAATATAGCAAATGCCCCCTACCGCTATCTGCTGAAACCA 

GACTTGGGCGATTCTGTGTGTCGCAAATATCGCAGTTTCGATATAGGTGACAGACGATAT 
301 + + + + + + 360 

CTGAACCCGCTAAGACACACAGCGTTTATAGCGTCAAAGCTATATCCACTGTCTGCTATA 

C BH N C 

f aa si 
r le i a 

1 11 11 

GAGGCTATATCGCCGATAGAGGCGACATCAAGCTGGCACATGGCCAATGCATATCGATCT 
361 + + + + + + 420 

CTCCGATATAGCGGCTATCTCCGCTGTAGTTCGACCGTGTACCGGTTACGTATAGCTAGA 



Fig. 4A 
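
The cut positions marked above the sequence in a restriction map of this kind can be reproduced by searching the top strand for each enzyme's recognition sequence. The following is an illustrative sketch only: the function name `find_sites` is hypothetical, the recognition sequences used (NcoI CCATGG, PstI CTGCAG, EcoRV GATATC) are the standard ones rather than values stated in this document, and the input is the first 60 bases of the top strand shown in Fig. 4A with spacing removed.

```python
# Standard recognition sequences (assumed from general knowledge, not from the figure).
SITES = {
    "NcoI": "CCATGG",
    "PstI": "CTGCAG",
    "EcoRV": "GATATC",
}

# Bases 1-60 of the top strand as printed in Fig. 4A.
TOP_1_60 = "CCATGGTGTCAAGGACGGTGACTGCAGTGAATAATAAAATGTGTGTTTGTCCGAAATACG"


def find_sites(sequence, site):
    """Return the 1-based start position of every occurrence of site in sequence."""
    sequence = sequence.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based position
        start = sequence.find(site, start + 1)
    return positions


for enzyme, site in SITES.items():
    print(enzyme, find_sites(TOP_1_60, site))
```

Because the three sites used here are palindromic, searching one strand is sufficient; a non-palindromic site would also need a search of the reverse complement.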



S C BH 
s f aa 
p r le 
1 1 11 
/ 

ATACATTGAATCAATATTGGCCATTAGCCATATTATTCATTGGTTATATAGCATAAATCA 
421 + + + + + + 480 

TATGTAACTTAGTTATAACCGGTAATCGGTATAATAAGTAACCAATATATCGTATTTAGT 
S C BH 

s f aa 

P r le 

1 1 11 

/ 

ATATTGGCTATTGGCCATTGCATACGTTGTATCCATATCATAATATGTACATTTATATTG 
481 + + + + + + 540 

TATAACCGATAACCGGTAACGTATGCAACATAGGTATAGTATTATACATGTAAATATAAC 

H 

1 M S 
n m p 
c e e 

2 1 1 

GCTCATGTCCAACATTACCGCCATGTTGACATTGATTATTGACTAGTTATTAATAGTAAT 
541 + + + + + + 600 

CGAGTACAGGTTGTAATGGCGGTACAACTGTAACTAATAACTGATCAATAATTATCATTA 



CAATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGG 
601 + + + + + + 660 

GTTAATGCCCCAGTAATCAAGTATCGGGTATATACCTCAAGGCGCAATGTATTGAATGCC 



B A A 
9 ha 
1 a t 
1 2 2 
TAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGT 
661 + + + + + + 720 

ATTTACCGGGCGGACCGACTGGCGGGTTGCTGGGGGCGGGTAACTGCAGTTATTACTGCA 

A A 
h a 
a t 
2 2 

ATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTAC 
721 + + + + + + 780 

TACAAGGGTATCATTGCGGTTATCCCTGAAAGGTAACTGCAGTTACCCACCTCATAAATG 
B N 

g d 

1 e 

1 1 

GGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTG 
781 + + + + + + 840 

CCATTTGACGGGTGAACCGTCATGTAGTTCACATAGTATACGGTTCATGCGGGGGATAAC 



Fig.4B 
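
Since the figure prints both strands, each row can be checked for transcription errors by confirming that the lower strand is the base-by-base complement of the upper strand. The sketch below is an assumption-laden illustration rather than anything described in the document: the helper names `complement` and `mismatches` are hypothetical, and the example row is bases 661 to 720 as printed in Fig. 4B. The lower strand is written 3' to 5' from left to right, so a plain complement (not a reverse complement) is compared.

```python
# Translation table for Watson-Crick pairing of the four unambiguous bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Bases 661-720 of Fig. 4B, upper and lower strands as printed.
TOP_661_720 = "TAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGT"
BOTTOM_661_720 = "ATTTACCGGGCGGACCGACTGGCGGGTTGCTGGGGGCGGGTAACTGCAGTTATTACTGCA"


def complement(strand):
    """Return the base-by-base complement of strand, read in the same direction."""
    return strand.upper().translate(COMPLEMENT)


def mismatches(top, bottom):
    """Return 1-based positions where bottom is not the complement of top."""
    expected = complement(top)
    return [
        index + 1
        for index, (want, got) in enumerate(zip(expected, bottom.upper()))
        if want != got
    ]


print(mismatches(TOP_661_720, BOTTOM_661_720))
```

An empty list means the two printed strands agree; any listed position points at a base that was misread on one strand or the other.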



A A B 
ha g 
at i 
ACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTATGGGACT 
841 + + + -- + + 900 

TGCAGTTACTGCCATTTACCGGGCGGACCGTAATACGGGTCATGTACTGGAATACCCTGA 

S 

n DNS 
a set 
B aoy 
1 111 

// 

TTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTT 
901 + + + + + + 960 

AAGGATGAACCGTCATGTAGATGCATAATCAGTAGCGATAATGGTACCACTACGCCAAAA 



GGCAGTACATCAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACC 
961 + + + + + + 1020 

CCGTCATGTAGTTACCCGCACCTATCGCCAAACTGAGTGCCCCTAAAGGTTCAGAGGTGG 

A A B 

ha a 

at n 

2 2 1 

CCATTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTC 
1021 + + + + + + 1080 

GGTAACTGCAGTTACCCTCAAACAAAACCGTGGTTTTAGTTGCCCTGAAAGGTTTTACAG 

GTAACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATA 
1081 + + + + + + 1140 

CATTGTTGAGGCGGGGTAACTGCGTTTACCCGCCATCCGCACATGCCACCCTCCAGATAT 
BH 

BssS G A 

apia s h 

nlAc u a 

2211 1 2 

/// 

TAAGCAGAGCTCGTTTAGTGAACCGTCAGATCGCCTGGAGACGCCATCCACGCTGTTTTG 

1141 + + + + + + 1200 

ATTCGTCTCGAGCAAATCACTTGGCAGTCTAGCGGACCTCTGCGGTAGGTGCGACAAAAC 

N 

B D BCGsSX 

b s gfdpam 

v a IriBca 

2 1 112223 

//// 

ACCTCCATAGAAGACACCGGGACCGATCCAGCCTCCGCGGCCGGGAACGGTGCATTGGAA 

1201 + + + + + + 1260 

TGGAGGTATCTTCTGTGGCCCTGGCTAGGTCGGAGGCGCCGGCCCTTGCCACGTAACCTT 



Fig.4C 



CGCGGATTCCCCGTGCCAAGAGTGACGTAAGTACCGCCTATAGAGTCTATAGGCCCACCC 

1261 + + + + + + 1320 

GCGCCTAAGGGGCACGGTTCTCACTGCATTCATGGCGGATATCTCAGATATCCGGGTGGG 

B N 
Ss N sS 

tt s pp 

yX i Hh 

11 1 11 

/ 

CCTTGGCTTCTTATGCATGCTATACTGTTTTTGGCTTGGGGTCTATACACCCCCGCTTCC 

1321 + + + + + + 1380 

GGAACCGAAGAATACGTACGATATGACAAAAACCGAACCCCAGATATGTGGGGGCGAAGG 

E 

s 

P 
1 

TCATGTTATAGGTGATGGTATAGCTTAGCCTATAGGTGTGGGTTATTGACCATTATTGAC 
1381 + + + + + + 1440 

AGTACAATATCCACTACCATATCGAATCGGATATCCACACCCAATAACTGGTAATAACTG 

P 
f 
1 
M 
1 

CACTCCCCTATTGGTGACGATACTTTCCATTACTAATCCATAACATGGCTCTTTGCCACA 
1441 + + + + + + 1500 

GTGAGGGGATAACCACTGCTATGAAAGGTAATGATTAGGTATTGTACCGAGAAACGGTGT 

E 
c 
o 
5 
7 

ACTCTCTTTATTGGCTATATGCCAATACACTGTCCTTCAGAGACTGACACGGACTCTGTA 
1501 + + + + + + 1560 

TGAGAGAAATAACCGATATACGGTTATGTGACAGGAAGTCTCTGACTGTGCCTGAGACAT 

E 
c 
o 
3 
1 

TTTTTACAGGATGGGGTCTCATTTATTATTTACAAATTCACATATACAACACCACCGTCC 
1561 + + + + + + 1620 

AAAAATGTCCTACCCCAGAGTAAATAATAAATGTTTAAGTGTATATGTTGTGGTGGCAGG 

B 

s X A A 

P h v f 

1 o a 1 

2 2 13 

CCAGTGCCCGCAGTTTTTATTAAACATAACGTGGGATCTCCACGCGAATCTCGGGTACGT 
1621 + + + + + + 1680 

GGTCACGGGCGTCAAAAATAATTTGTATTGCACCCTAGAGGTGCGCTTAGAGCCCATGCA 

Fig.4D 



6 B B 

% Bs 

P a P 3D 

M nl H 

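GTTCCGGACATGGGCTCTTCTCCGGTAGCGGCGGAGCTTCTACATCCGAGCCCTGCTCCC 
1681 + + + + + + 1740 

CAAGGCCTGTACCCGAGAAGAGGCCATCGCCGCCTCGAAGATGTAGGCTCGGGACGAGGG 
I 

u a 

ATGCCTCCAGCGACTCATGGTCGCTCGGCAGCTCCTTGCTCCTAACAGTGGAGGCCAGAC 
1741 + + + + + + 1800 

TACGGAGGTCGCTGAGTACCAGCGAGCCGTCGAGGAACGAGGATTGTCACCTCCGGTCTG 

D 

s 
a 

TTAGGCACAGCACGATGCCCACCACCACCAGTGTGCCGCACAAGGCCGTGGCGGTAGGGT 
1801 + + + + + + 1860 

AATCCGTGTCGTGCTACGGGTGGTGGTGGTCACACGGCGTGTTCCGGCACCGCCATCCCA 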

BH N 

ABsgS s a B 

vapia p f b 

anlAc b i v 

12211 2 y ? 

/// 

ATGTGTCTGAAAATGAGCTCGGGGAGCGGGCTTGCACCGCTGACGCATTTGGAAGACTTA 

1861 + + + + + + 1920 

TACACAGACTTTTACTCGAGCCCCTCGCCCGAACGTGGCGACTGCGTAAACCTTCTGAAT 

N N 

s sP 

P pv 

B Bu 

2 22 

AGGCAGCGGCAGAAGAAGATGCAGGCAGCTGAGTTGTTGTGTTCTGATAAGAGTCAGAGG 
1921 + + + + + + 1980 

TCCGTCGCCGTCTTCTTCTACGTCCGTCGACTCAACAACACAAGACTATTCTCAGTCTCC 

H 

iH s 
np c 



/ 

TAACTCCCGTTGCGGTGCTGTTAACGGTGGAGGGCAGTGTAGTCTGAGCAGTACTCGTTG 
1981 + + + + + + 2040 

ATTGAGGGCAACGCCACGACAATTGCCACCTCCCGTCACATCAGACTCGTCATGAGCAAC 



ca a 
21 : 
CTGCCGCGCGCGCCACCAGACATAATAGCTGACAGACTAACAGACTGTTCCTTTCCATGG 
2041 + + + + + + 2100 

GACGGCGCGCGCGGTGGTCTGTATTATCGACTGTCTGATTGTCTGACAAGGAAAGGTACC 
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^^GTCTpTCTGCAGTCACCGTCCTTGACjScCATG; 
CAGAAAAGACGTCAGTGGCAGGAACTGTG +I 



Fig.AF 